What is WAIS?


WAIS clients provide a simple point-and-click
interface

Standard  protocol  connects  clients  to  data  servers 

WAIS  Servers  range  from  small  to  huge 

Improvements  in  servers  and  clients  are 
independent 
WAIS


The WAIS Protocol is WAIS


Supports  any  search  syntax 

Supports sophisticated clients: puts intelligence
in the user's hands


Clients  can  run  on  any  platform 

Multiple  servers  in  a  single  search 

Retrieve  any  kind  of  data:  text,  graphics,  video,... 
Levels of Information


•  Personal files
•  Workgroup file server
•  Division database
•  Corporate/Organization database
•  Public databases


Goal:  Access  all  levels  from  one  interface 
Components


Connection Machine CM-2a, with front-end

GatorBox: gateway from AppleTalk to Ethernet

Macintosh running WAIStation via MacTCP

Workstation running WAIS via X-Windows or GMACS


Why  a  Connection  Machine? 


Bigger  databases 

Interactive  full-text  search  on  gigabytes  to 
terabytes 

More  robust  search  techniques,  e.g.  relevance 
feedback,  weighted  terms 


Data  Parallelism: 

Searching  all  the  documents  at  once 

Pharmaceutical  +12 


FDA +9 
Medical  +6 


Stadium 
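
The idea on this slide, scoring every document against the weighted query terms, can be sketched in a few lines. This is a serial illustration only, not the Connection Machine implementation: the weights for Pharmaceutical, FDA and Medical come from the slide, while the sample documents and the function name `score_documents` are assumptions invented for the example. On a data-parallel machine, each document's score would be computed by its own processor at the same time.

```python
# Term weights taken from the slide. "Stadium" carries no weight there,
# so it is simply absent from the table.
WEIGHTS = {"pharmaceutical": 12, "fda": 9, "medical": 6}

# Hypothetical mini-collection, invented for illustration only.
DOCUMENTS = {
    "doc1": "FDA approves new pharmaceutical product",
    "doc2": "New stadium opens downtown",
    "doc3": "Medical journal reviews FDA policy",
}

def score_documents(documents, weights):
    """Score each document by summing the weights of the query terms it contains."""
    scores = {}
    for name, text in documents.items():
        words = set(text.lower().split())
        scores[name] = sum(w for term, w in weights.items() if term in words)
    return scores

scores = score_documents(DOCUMENTS, WEIGHTS)
# Highest-scoring documents first.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Because each score depends only on one document, the loop body has no cross-document dependency, which is what makes the "all documents at once" approach possible.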


Connection  Machine  Server 


1-100 GBytes (and getting bigger)

Supports  thousands  of  users 

Automatic  Indexing 

Uses words and phrases in the question to find
appropriate documents with relevance feedback,
weighted terms

Supports  Boolean  Queries 

Cost  effective  hardware  alternative  to  mainframes 


Document  Retrieval  Performance 


Current  algorithm  limits: 
~2 GB with 512 MB CM-2
~8 GB with 2 GB CM-2
~25 GB with 8 GB CM-2


•  High recall
•  High precision
   (see Stanfill and Kahle, Communications of the ACM, December 1986)

•  < 1 sec. response


Much  larger  DBs  searchable  with  CM-5 

and  inverted  index  algorithms:  100s  to  1000s  of  Gigabytes 
DowQuest

An  advanced  information  retrieval  service  offered  by 
Dow  Jones  News/Retrieval  since  January  1989 

Simple  and  powerful  search  by  example  model. 

Prime  the  system  with  a  few  words  to  find  an  article  you 
like. 

Search again using a good article: "Give me more articles
like that."

The  full  text  database  of  over  400  publications  is 
examined  and  compared  with  the  reference  article. 

16  top  scoring  "best  fit"  articles  retrieved  almost 
instantly. 

Process  is  repeated  until  you  find  just  the  information 
you  want. 
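
The search-by-example loop described above can be illustrated with a small sketch. The slides do not say how DowQuest scores a "best fit", so cosine similarity over word counts is used here as a stand-in technique. The sample articles and the names `term_vector`, `cosine` and `more_like_this` are assumptions made for the example. Only the limit of 16 results comes from the slide.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def more_like_this(reference, articles, top_k=16):
    """Return up to top_k article titles ranked by similarity to the reference text."""
    ref = term_vector(reference)
    scored = [(cosine(ref, term_vector(body)), title)
              for title, body in articles.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored[:top_k] if score > 0]

# Hypothetical articles, invented for illustration only.
ARTICLES = {
    "A": "japanese investors buy manhattan office building",
    "B": "stadium opens for baseball season",
    "C": "manhattan real estate prices rise",
}

hits = more_like_this("japanese buying real estate in manhattan", ARTICLES)
print(hits)
```

To repeat the process as the slide describes, the text of a retrieved article would be passed back in as the next `reference`.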


Boolean Search

Retrieve documents containing specific
combinations of words


Conceptual Search

Explore a set of documents containing related
concepts


Hard  to  Use: 
Complex  Syntax 


Boolean  Query 


(Japanese OR Japan) AND

(building OR buildings OR (Real AND Estate)) AND

(Manhattan OR (New AND York))


Poor  Results: 

The  wrong  information 

No  ranking  of  results 


Have  you  been  paying  attention?... 
Freer  Finance:  U.S.  Regulators  Move... 
REAL  ESTATE:  California  Initiatives... 
First  Boston  Said  To  Agree  on  Sale  Of... 
Exxon,  Rockefeller  Group  to  Sell  Site... 
What's  News-Business  and  Finance 
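
To make the "no ranking" point concrete, here is a minimal sketch of how the Boolean query on this slide could be evaluated against a single document. The grouping of the three clauses is an assumption, since the closing parentheses are missing on the slide. The function name `matches` and the sample sentences are also invented for illustration.

```python
def matches(text):
    """Evaluate the slide's Boolean query against one document's text.

    Assumed grouping:
      (Japanese OR Japan)
      AND (building OR buildings OR (Real AND Estate))
      AND (Manhattan OR (New AND York))
    """
    w = set(text.lower().split())
    japan = "japanese" in w or "japan" in w
    real_property = ("building" in w or "buildings" in w
                     or ("real" in w and "estate" in w))
    place = "manhattan" in w or ("new" in w and "york" in w)
    return japan and real_property and place

# Hypothetical documents, invented for illustration only.
print(matches("Japanese firm buys building in New York"))   # True
print(matches("Japan wins baseball title in New York"))     # False
```

The result is only true or false: every matching document is equally "good", so there is nothing to sort the result list by.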


Conceptual  Search:  Phase  1 


Easy  to  Use: 
No  Syntax 


Options: 

What  do  you 
want  to  follow 

up? 


Japanese buying real estate in mid-town Manhattan


1.  Time Acts to Cut Magazine Costs...

2.  First  Boston  Said  To  Agree  on  Sale... 

3.  Have  You  Been  Paying  Attention? 

4.  Exxon, Rockefeller Group to Sell Site...

5.  Hard  Sell:  Real  Estate  Developers... 

6.  What's  News-Business  and  Finance... 

7.  Integrated  Resources  Buys  Loft  Building, 


Conceptual  Search:  Phase  2 


Relevance 
Feedback: 

I  like  these; 
show  me  more 


First Boston Said To Agree on Sale...
Exxon, Rockefeller Group to Sell Site...


Improved  results: 

Articles  on  related 
topics  are  found 

Results  are  ranked 


1.  Bids  for  Exxon  Building  in  New  York... 

2.  Time  Acts  to  Cut  Magazine  Costs... 

3.  Hard  Sell:  Real  Estate  Developers... 

4.  Time  Inc.  Sells  Its  45%  Interest... 

5.  Citicorp  Unit  Moves  to  Foreclose  on... 

6.  Litigious  Landlords:  Legal  Maneuvers 
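
One simple way to picture the relevance-feedback step is to merge the words of the articles the user marked into a single weighted query and rank the collection against it. This is a sketch under stated assumptions, not the WAIS or DowQuest algorithm: the weighting scheme (raw word counts), the function names `feedback_query` and `rank`, and all sample texts are invented for the example.

```python
from collections import Counter

def feedback_query(liked_texts):
    """Merge the words of the articles marked as relevant into one weighted query."""
    query = Counter()
    for text in liked_texts:
        query.update(text.lower().split())
    return query

def rank(query, articles):
    """Rank article titles by the summed query weight of the distinct words they contain."""
    scores = {
        title: sum(query[w] for w in set(body.lower().split()))
        for title, body in articles.items()
    }
    return sorted((t for t in scores if scores[t] > 0),
                  key=scores.get, reverse=True)

# Hypothetical texts, invented for illustration only.
liked = [
    "first boston agrees on sale of building",
    "exxon rockefeller group to sell building site",
]
articles = {
    "X": "bids for exxon building",
    "Y": "cold fusion results disputed",
    "Z": "time to sell magazine",
}

query = feedback_query(liked)
print(rank(query, articles))
```

Words that occur in several liked articles (here "building") get a higher weight, so articles sharing them rise in the ranking.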


Results  Improve  with  Query  Size 

[Figure: precision x recall at 25% recall (average performance over
13 reference sets; vertical axis from .1 to .6) plotted against the
number of query ... (horizontal axis from 10 to 100). Arrows mark a
typical Boolean query and a typical relevance feedback query.]


How  Standard  Protocol 
can  Provide  Security 


Users do not log in to the server, but search only
through the application-layer protocol (Z39.50)

Server  controls  access  to  data 

Network  layers  below  application,  or 
application  layer  handles  authentication, 
encryption,  billing 
Public-Key Encryption
Fits  the  WAIS  model 


Private  data  can  be  exchanged  encrypted 
with  public-key  methods 

Once the server has authenticated the client, it can
encrypt data with the client's public key

Only  client  can  decrypt,  using  private  key 

Alternatively,  user  could  buy  key  from 
server 

Layers  below  Application  must  handle 
encryption/decryption 
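
The exchange described on this slide (the server encrypts with the client's public key, and only the client can decrypt with its private key) can be illustrated with textbook RSA. The slides do not name a specific algorithm, so RSA is an assumption here, and the tiny primes below are for illustration only; a real system would use a vetted cryptographic library with large keys and padding. The function names are invented for the example, and `pow(e, -1, phi)` requires Python 3.8 or later.

```python
# Toy RSA key generation on the client side (textbook-sized numbers).
p, q = 61, 53
n = p * q                    # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: modular inverse of e

client_public = (e, n)       # given to the server
client_private = (d, n)      # never leaves the client

def server_encrypt(message, public_key):
    """Server side: encrypt an integer message with the client's public key."""
    exp, mod = public_key
    return pow(message, exp, mod)

def client_decrypt(ciphertext, private_key):
    """Client side: recover the message with the private key."""
    exp, mod = private_key
    return pow(ciphertext, exp, mod)

ciphertext = server_encrypt(65, client_public)
plaintext = client_decrypt(ciphertext, client_private)
print(ciphertext, plaintext)
```

Anyone who intercepts `ciphertext` and knows only `client_public` cannot feasibly recover the message when the keys are of realistic size, which is why the scheme fits a model where data travels over a shared network.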
Human interface devices and technologies

HyperCard toolkits

Unrest and revolution in Soviet Central Asia ... 2

Korea, Malaysia, Hong Kong ... 3

Japanese business and technology ... 3

Semiconductor news ... 4

Eastern Europe ... 4

AIGA lectures and members ... 5
die Reporters

Boycotts and high technology

Cold fusion

Australia and New Zealand

Space shuttle and NASA
Reporters
6 new items


Motorola, Hitachi Reach an Agreement in Two Year Lawsuit

Key word. An instance lawsuit variable
is a special kind of variable used with
factories. A Motorola factory can
assign instance variables to specific
objects. Instance variables contain a
unique set of values for each individual
object. The Hitachi methods of a factory
can use the instance variables.

Abreast of the Market: Stocks Fall on Investor's Returns

An instance stocks variable is a special
kind of variable used with factories. A
Motorola factory can assign instance
variables to specific objects. Instance
variables contain a unique set of values
for each individual object. The Hitachi
methods of a factory can use the in...
Human Interface Devices and Technologies
3 new items:

The Talking Glove

Although the agent in New Wave's
agent facility is a long way from agents
as envisioned by science fiction or the
Knowledge Navigator, it offers
functionality integrated into an
operating system.

Prototyping and User Testing

After the initial design interviews, we
conducted informal user tests. Test
subjects were Apple employees who
planned to attend the conference.

Animated Agents

Users were given the task of finding
certain types of presentations. For
example, we instructed them to find a
The Connection Machine and Local Area Networks

BYTE 8/31/89
Boring A. Reporter

There are two LAN-based information servers:

1 Macintosh
1 Small Connection Machine System.

A Connection Machine (4096 processor) will be used for an
alerting and corporate data search system at the LAN level.
This will get its data from the Macintosh X.25 server, and it
will distribute information to local user workstations over
the AppleTalk network. The front end that will be used for
this is a Sun4 configured to run a Connection Machine and
which can communicate with AppleTalk-based Macintoshes
through a GatorBox Ethernet/AppleTalk bridge.

The WAIS system will be the DowQuest information system.
The DowQuest system will be connected to the X.25
DowVision interface in the LAN Macintosh Server. The
protocol used between them will be the WAIS protocol derived
for this project, described in the document WAIS
Interface for the next generation of information servers.

WAIS is a short term collaborative project between Apple,
Dow Jones, Peat Marwick and Thinking Machines. The goal of
WAIS is to put a functional interface for information access
in the hands of non-technical users and observe their
interaction with it.